deBGA: read alignment with de Bruijn graph-based seed and extension

Authors

Bo Liu

Hongzhe Guo

Michael Brudno

Yadong Wang

Abstract

MOTIVATION: As high-throughput sequencing (HTS) technology becomes ubiquitous and data volumes continue to rise, HTS read alignment is increasingly the rate-limiting step, which drives the development of novel read alignment approaches. Moreover, promising new applications of HTS technology require aligning reads to multiple genomes instead of a single reference; however, state-of-the-art aligners still cannot feasibly align large numbers of reads to multiple genomes. RESULTS: We propose de Bruijn Graph-based Aligner (deBGA), an innovative graph-based seed-and-extension algorithm that aligns HTS reads to a reference genome organized and indexed using a de Bruijn graph. Because it handles repeats well, deBGA is substantially faster than state-of-the-art approaches while maintaining similar or higher sensitivity and accuracy. This makes it particularly well suited to the rapidly growing volumes of sequencing data. Furthermore, it provides a promising solution for aligning reads to multiple genomes and graph-based references in HTS applications. AVAILABILITY AND IMPLEMENTATION: deBGA is available at: https://github.com/hitbc/deBGA CONTACT: [email protected] SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
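The abstract does not describe deBGA's index layout or extension step, so the following is only a toy sketch of the general idea of seeding through a k-mer (de Bruijn graph vertex) index shared by several genomes. All names (`build_kmer_index`, `successors`, `align_read`), the Hamming-distance extension, and the tiny example sequences are assumptions made for illustration and are not taken from the deBGA code base.

```python
from collections import Counter, defaultdict


def build_kmer_index(references, k):
    """Map each k-mer (a graph vertex) to every (ref_id, position) where it occurs.

    A k-mer that is repeated, or shared between genomes, is stored once
    with several locations, which is what collapses repeats.
    """
    index = defaultdict(list)
    for ref_id, seq in references.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((ref_id, pos))
    return index


def successors(index, kmer):
    """Out-neighbours of a vertex: indexed k-mers that overlap it by k-1 bases."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in index]


def align_read(read, references, index, k, max_mismatches=2):
    """Seed with exact k-mer hits, then verify each candidate start (toy extension)."""
    votes = Counter()
    for offset in range(len(read) - k + 1):
        for ref_id, pos in index.get(read[offset:offset + k], ()):
            # Seeds on the same diagonal imply the same read start position.
            votes[(ref_id, pos - offset)] += 1
    hits = []
    for (ref_id, start), _ in votes.most_common():
        ref = references[ref_id]
        if start < 0 or start + len(read) > len(ref):
            continue
        # Assumed extension: ungapped comparison counting mismatches only.
        window = ref[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append((ref_id, start, mismatches))
    return sorted(hits, key=lambda h: (h[2], h[0], h[1]))


# Hypothetical two-genome reference set, made up for this example.
references = {
    "g1": "ACGTACGTTAGCCGATAGGCTA",
    "g2": "TTACGTTAGCCGATCGGCTAAC",
}
index = build_kmer_index(references, k=5)
print(align_read("CGTTAGCCGATA", references, index, k=5))
```

In this sketch the vertex `CCGAT` has two successors, one per genome, so a single index covers both references and the point where they diverge appears as a branch rather than as duplicated sequence.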


Similar resources

EPGA: de novo assembly using the distributions of reads and insert size

MOTIVATION In genome assembly, the primary issue is how to determine upstream and downstream sequence regions of sequence seeds for constructing long contigs or scaffolds. When extending one sequence seed, repetitive regions in the genome always cause multiple feasible extension candidates which increase the difficulty of genome assembly. The universally accepted solution is choosing one based ...


Clustering of Short Read Sequences for de novo Transcriptome Assembly

Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...


deBGR: an efficient and near-exact representation of the weighted de Bruijn graph

Motivation: Almost all de novo short-read genome and transcriptome assemblers start by building a representation of the de Bruijn Graph of the reads they are given as input. Even when other approaches are used for subsequent assembly (e.g. when one is using 'long read' technologies like those offered by PacBio or Oxford Nanopore), efficient k-mer processing is still crucial for accurate assembl...



De Bruijn Graph based De novo Genome Assembly

Next Generation Sequencing (NGS) is an important process that enables inexpensive organization of vast raw sequence data sets compared with traditional sequencing systems or methods. Various aspects of NGS, such as template preparation, sequencing and imaging, and genome alignment and assembly, outline genome sequencing and alignment. Consequently, the de Bruijn graph (dBG) is an important mathema...



Journal:

Bioinformatics

Volume 32, Issue 21

Pages: -

Publication date: 2016
